Hierarchical Scheduling of DAG Structured Computations on Manycore Processors with Dynamic Thread Grouping

Authors

  • Yinglong Xia
  • Viktor K. Prasanna
  • James Li
Abstract

Many computational solutions can be expressed as directed acyclic graphs (DAGs) with weighted nodes. In parallel computing, scheduling such DAGs onto manycore processors remains a fundamental challenge, since synchronizing dozens of threads while preserving precedence constraints can dramatically degrade performance. To improve scheduling performance on manycore processors, we propose a hierarchical scheduling method with dynamic thread grouping, which schedules DAG structured computations at three levels. At the top level, a supermanager separates threads into groups, each consisting of a manager thread and several worker threads. The supermanager dynamically merges and partitions the groups to adapt the scheduler to the input task dependency graph. Through group merging and partitioning, the proposed scheduler can dynamically become a centralized scheduler, a distributed scheduler, or something in between, depending on the input graph. At the group level, managers collaboratively schedule tasks for their workers. At the within-group level, workers perform self-scheduling within their respective groups and execute tasks. We evaluate the proposed scheduler on the Sun UltraSPARC T2 (Niagara 2) platform, which supports up to 64 hardware threads. Across various input task dependency graphs, the proposed scheduler outperforms various baseline methods, including typical centralized and distributed schedulers.
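
The group-level and within-group levels described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function name `schedule`, its parameters, and the least-loaded-group allocation policy are assumptions made for illustration. The sketch also simplifies in two ways: groups are fixed for the whole run (no supermanager merging or partitioning), and there are no dedicated manager threads, so the allocation step runs inside whichever worker finishes a task.

```python
import threading

def schedule(succ, weight, group_sizes):
    """Run every task of a DAG once; return task ids in completion order.

    succ: dict mapping each task id to a list of its successor task ids
          (every task must appear as a key; the graph must be acyclic).
    weight: dict mapping each task id to an estimated cost.
    group_sizes: number of worker threads in each group (each >= 1).
    """
    # Dependency degree: number of unfinished predecessors per task.
    indeg = {t: 0 for t in succ}
    for t in succ:
        for s in succ[t]:
            indeg[s] += 1
    ready = [[] for _ in group_sizes]   # one ready list per group
    load = [0] * len(group_sizes)       # outstanding weight per group
    done = []                           # completion order
    cond = threading.Condition()

    def allocate(task):
        # Manager role (assumed policy): give the task to the least-loaded group.
        g = min(range(len(load)), key=load.__getitem__)
        ready[g].append(task)
        load[g] += weight[task]

    def worker(g):
        while True:
            with cond:
                # Self-scheduling: take work from this group's shared ready list.
                while not ready[g] and len(done) < len(succ):
                    cond.wait()
                if not ready[g]:
                    return              # every task has completed
                task = ready[g].pop()
            # The task body would execute here, outside the lock.
            with cond:
                done.append(task)
                load[g] -= weight[task]
                for s in succ[task]:
                    indeg[s] -= 1
                    if indeg[s] == 0:   # all predecessors finished
                        allocate(s)
                cond.notify_all()

    for t, d in indeg.items():
        if d == 0:                      # tasks with no predecessors start ready
            allocate(t)
    threads = [threading.Thread(target=worker, args=(g,))
               for g, n in enumerate(group_sizes) for _ in range(n)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return done
```

With `group_sizes=[4]` all workers share one ready list, which resembles a centralized scheduler. With `group_sizes=[1, 1, 1, 1]` each worker has its own list, which resembles a distributed one. `[2, 2]` sits in between. The scheduler in the abstract moves between such configurations at run time, which this sketch does not attempt.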

Similar resources

Sparse direct solvers with accelerators over DAG runtimes

The current trend in high performance computing shows a dramatic increase in the number of cores on shared memory compute nodes. Algorithms, especially those related to linear algebra, need to be adapted to these new computer architectures in order to be efficient. PASTIX is a sparse parallel direct solver that incorporates a dynamic scheduler for strongly hierarchical modern architec...

Thread-level priority assignment in global multiprocessor scheduling for DAG tasks

The advent of multi- and many-core processors offers enormous performance potential for parallel tasks that exhibit sufficient intra-task thread-level parallelism. With the growth of novel parallel programming models (e.g., OpenMP, MapReduce), scheduling parallel tasks in the real-time context has received increasing attention in the recent past. While most studies focused on schedulability anal...

An Efficient Thread Mapping Strategy for Multiprogramming on Manycore Processors

The emergence of multicore and manycore processors is set to change the parallel computing world. Applications are shifting towards increased parallelism in order to utilise these architectures efficiently. This leads to a situation where every application creates its desirable number of threads, based on its parallel nature and the system resources allowance. Task scheduling in such a multithr...

Scheduling Recurrent Precedence-Constrained Task Graphs on a Symmetric Shared-Memory Multiprocessor

We consider approaches that allow task migration for scheduling recurrent directed-acyclic-graph (DAG) tasks on symmetric, shared-memory multiprocessors (SMPs) in order to meet a given throughput requirement with fewer processors. Within the scheduling approach proposed, we present a heuristic based on grouping DAG subtasks for lowering the end-to-end latency and an algorithm for computing an u...

Scheduling Direct Acyclic Graphs on Massively Parallel 1K-core Processors

The era of manycore computing will bring new fundamental challenges: techniques designed for single-core processors will have to change dramatically to support the coming wave of extreme-scale computing with thousands of cores on a single processor. Today’s programming languages (e.g. C/C++, Java) are unlikely to scale to manycore levels. One approach to address such concurrency pro...

Publication date: 2010